Let’s read the dataset and remove the samples containing less than 8522 reads:
## [1] 47144 130
## OTUId st001 st002 st003 st004
## 1 OTU_1 469 187 240 135
## 2 OTU_2 445 389 760 819
## 3 OTU_3 1030 2196 1611 453
## 4 OTU_4 135 66 36 365
## 5 OTU_5 0 178 285 175
## Warning in `[<-.factor`(`*tmp*`, thisvar, value = "NA"): invalid factor
## level, NA generated
## Warning in `[<-.factor`(`*tmp*`, thisvar, value = "NA"): invalid factor
## level, NA generated
## Warning in `[<-.factor`(`*tmp*`, thisvar, value = "NA"): invalid factor
## level, NA generated
## Warning in `[<-.factor`(`*tmp*`, thisvar, value = "NA"): invalid factor
## level, NA generated
## Warning in `[<-.factor`(`*tmp*`, thisvar, value = "NA"): invalid factor
## level, NA generated
## Warning in `[<-.factor`(`*tmp*`, thisvar, value = "NA"): invalid factor
## level, NA generated
## Warning in `[<-.factor`(`*tmp*`, thisvar, value = "NA"): invalid factor
## level, NA generated
## [1] 47144 129
## st001 st002 st003 st004 st005
## OTU_1 469 187 240 135 184
## OTU_2 445 389 760 819 1003
## OTU_3 1030 2196 1611 453 512
## OTU_4 135 66 36 365 146
## OTU_5 0 178 285 175 49
## [1] 47144 7
## SILVA_taxonomic_classification
## OTU_1 KC488417.1.1688_Eukaryota;SAR;Alveolata;Dinoflagellata;Dinophyceae;Gymnodiniphycidae;Gymnodinium_clade;FV18-2D9;uncultured_dinoflagellate
## OTU_2 FJ832119.1.1585_Eukaryota;SAR;Alveolata;Protalveolata;Syndiniales;Syndiniales_Group_I;uncultured_marine_picoplankton
## OTU_3 EF172960.1.1674_Eukaryota;SAR;Alveolata;Protalveolata;Syndiniales;Syndiniales_Group_II;uncultured_eukaryote
## OTU_4 KC488405.1.1558_Eukaryota;SAR;Alveolata;Dinoflagellata;Dinophyceae;Gymnodiniphycidae;Gymnodinium_clade;Erythropsidinium;uncultured_dinoflagellate
## OTU_5 EU793901.1.1663_Eukaryota;SAR;Alveolata;Protalveolata;Syndiniales;Syndiniales_Group_I;uncultured_syndiniales
## MAS_taxonomic_classification BM_taxonomic_classification
## OTU_1 EU793855_Dinophyceae_Alveolata 9492_Dinophyceae_Alveolata_15
## OTU_2 EU793773_MALV-I_Alveolata 3261_MALV-I_Alveolata_385
## OTU_3 EU793316_MALV-II_Alveolata <NA>
## OTU_4 EU780636_Dinophyceae_Alveolata 21474_Dinophyceae_Alveolata_6
## OTU_5 EU818388_MALV-I_Alveolata 398_MALV-I_Alveolata_395
## SILVA_class MAS_class BM_class MAS_plus_BM_plus_SILVA_class
## OTU_1 Dinophyceae Dinophyceae Dinophyceae Dinophyceae
## OTU_2 <NA> <NA> <NA> <NA>
## OTU_3 <NA> <NA> <NA> <NA>
## OTU_4 Dinophyceae Dinophyceae Dinophyceae Dinophyceae
## OTU_5 <NA> <NA> <NA> <NA>
## [1] 47144 122
## st001 st002 st003 st004 st005
## OTU_1 469 187 240 135 184
## OTU_2 445 389 760 819 1003
## OTU_3 1030 2196 1611 453 512
## OTU_4 135 66 36 365 146
## OTU_5 0 178 285 175 49
## st054
## 4050
## [1] 47144 121
## [1] 47144 117
## [1] 47144 0
Table dimensions and content outline:
## [1] 47144 117
## st001 st002 st003 st004 st005
## OTU_1 469 187 240 135 184
## OTU_2 445 389 760 819 1003
## OTU_3 1030 2196 1611 453 512
## OTU_4 135 66 36 365 146
## OTU_5 0 178 285 175 49
Minimum number of reads per station:
min(colSums(tb18_tax_occur_min8522))
## [1] 8522
Maximum number of reads per station:
max(colSums(tb18_tax_occur_min8522))
## [1] 936570
Identification of station with higher number of reads:
amplicons_per_sample<-colSums(tb18_tax_occur_min8522)
amplicons_per_sample[which(colSums(tb18_tax_occur_min8522)>900000)]
## st131
## 936570
Overall reads per sample:
Let’s normalize the original dataset by randomly subsampling 8522 reads in each station:
library(vegan)
## Loading required package: permute
## Loading required package: lattice
## This is vegan 2.3-1
tb18_tax_occur_min8522_t<-t(tb18_tax_occur_min8522)
tb18_tax_occur_ss8522<-rrarefy(tb18_tax_occur_min8522_t, 8522)
The normalized table shows the following dimensions and format:
## [1] 117 47144
## OTU_1 OTU_2 OTU_3 OTU_4 OTU_5
## st001 26 25 49 8 0
## st002 23 44 241 3 24
## st003 34 131 237 2 48
## st004 37 189 94 102 31
## st005 24 141 76 32 8
Its content fits with the expected normalization values (8522 reads per station):
rowSums(tb18_tax_occur_ss8522)
## st001 st002 st003 st004 st005 st007 st009 st010 st012 st013 st014 st015
## 8522 8522 8522 8522 8522 8522 8522 8522 8522 8522 8522 8522
## st017 st018 st019 st022 st023 st024 st025 st026 st027 st028 st029 st030
## 8522 8522 8522 8522 8522 8522 8522 8522 8522 8522 8522 8522
## st032 st033 st034 st035 st037 st038 st039 st040 st041 st043 st044 st045
## 8522 8522 8522 8522 8522 8522 8522 8522 8522 8522 8522 8522
## st046 st047 st049 st050 st051 st052 st053 st055 st056 st057 st058 st059
## 8522 8522 8522 8522 8522 8522 8522 8522 8522 8522 8522 8522
## st060 st062 st063 st064 st065 st066 st067 st068 st069 st070 st071 st072
## 8522 8522 8522 8522 8522 8522 8522 8522 8522 8522 8522 8522
## st073 st074 st075 st076 st077 st078 st079 st081 st082 st083 st085 st086
## 8522 8522 8522 8522 8522 8522 8522 8522 8522 8522 8522 8522
## st087 st088 st089 st092 st093 st094 st095 st096 st097 st098 st101 st102
## 8522 8522 8522 8522 8522 8522 8522 8522 8522 8522 8522 8522
## st103 st104 st106 st107 st108 st109 st110 st112 st114 st115 st116 st118
## 8522 8522 8522 8522 8522 8522 8522 8522 8522 8522 8522 8522
## st119 st120 st121 st125 st126 st127 st128 st129 st130 st131 st132 st133
## 8522 8522 8522 8522 8522 8522 8522 8522 8522 8522 8522 8522
## st134 st135 st138 st139 st140 st141 st143 st145 st146
## 8522 8522 8522 8522 8522 8522 8522 8522 8522
Let’s check out how many OTUs don’t appear in the new table:
length(which(colSums(tb18_tax_occur_ss8522)==0))
## [1] 20431
There are 20487 OTUs that don’t show any occurrence in the normalized data. Let’s remove them from the table and take a look at its final dimensions:
tb18_tax_occur_ss8522_no_cero<-tb18_tax_occur_ss8522[,-(which(colSums(tb18_tax_occur_ss8522)==0))]
dim(tb18_tax_occur_ss8522_no_cero)
## [1] 117 26713
Datasets summary: dim(tb18_tax) –> 47144 129 dim(tb18_tax_occur) –> 47144 122 dim(tb18_tax_occur_ss8522_no_cero) –> 117 26657
Most of the samples take Shannon Index values between 5.5 and 6:
Lowest number of OTUs per sample:
## [1] 168
Maximum number of OTUs per sample:
## [1] 1986
In most of the samples, we can identify between 1000 and 1300 OTUs:
plot(sort(OTUs_per_sample_18S_tax_occur_ss8522), pch=19)
boxplot(OTUs_per_sample_18S_tax_occur_ss8522, pch=19)
pielou_evenness_18S_tax_occur_ss8522 <- tb18_tax_occur_ss8522_div/log(OTUs_per_sample_18S_tax_occur_ss8522)
The Pielou index (constrained between 0 and 1) takes values closer to 1 as the variation of species proportion in a sample decreases. Most of the samples get values between 0.8 and 0.9, meaning that the numerical composition of different OTUs in a sample is highly similar:
plot(sort(pielou_evenness_18S_tax_occur_ss8522), pch=19)
boxplot(pielou_evenness_18S_tax_occur_ss8522, pch=19)
The OTU_2, with 27292 reads, is the most abundant in the overall dataset:
head(sort(colSums(tb18_tax_occur_ss8522_no_cero), decreasing=T), n=10L)
## OTU_2 OTU_6 OTU_23 OTU_8 OTU_25 OTU_27 OTU_1 OTU_12 OTU_9 OTU_7
## 48439 35256 12264 11760 10044 9850 8568 8022 7937 7600
Most of the OTUs show very few occurrences; the plot suggests that we will probably be able to identify a significant ammount of rare otus:
plot(log(sort(colSums(tb18_tax_occur_ss8522_no_cero), decreasing=T)), pch=19)
The OTUs abundance distribution fits relativelly close to log-normal model.
According to Preston’s lognormal model fit into species frequencies groups, we’re missing ~3226 species:
tb18_tax_occur_ss8522_prestonfit<-prestonfit(colSums(tb18_tax_occur_min8522_t))
plot(tb18_tax_occur_ss8522_prestonfit, main="Pooled species")
veiledspec(tb18_tax_occur_ss8522_prestonfit)
## Extrapolated Observed Veiled
## 49081.293 46593.000 2488.293
When computing Prestons’ lognormal model fit without pooling data into groups, we seem to miss ~3055 species:
## Extrapolated Observed Veiled
## 48957.844 46593.000 2364.844
The Bray-Curtis dissimilarity, constrained between 0 (minimum distance) and 1 (highest dissimilarity) allows us to quantify the differences between samples according to the composition and relative abundance of their OTUs. In our dataset, most of the samples pairs take dissimilarity values between between 0.7 and 0.8, meaning that their composition is substantially different.
The only relatively evident cluster we can distinguish in the dendogram stands out in the very left side of the plot.
(To be done: assign Longhurst provinces information to each station and check if any of the central clusters is meaningful regarding to the samples’ geographical ubication)
We can identify a prominent group in the central part of the NMDS plot and a few outliers in the middle-left edge of the plot. The stress parameter takes a value below 0.3, suggesting that the plot is acceptable.
##
## Call:
## monoMDS(dist = tb18_tax_occur_ss8522_no_cero.bray)
##
## Non-metric Multidimensional Scaling
##
## 117 points, dissimilarity 'bray', call 'vegdist(x = tb18_tax_occur_ss8522_no_cero, method = "bray")'
##
## Dimensions: 2
## Stress: 0.2007517
## Stress type 1, weak ties
## Scores scaled to unit root mean square, rotated to principal components
## Stopped after 91 iterations: Stress nearly unchanged (ratio > sratmax)
When implementing a most robut function for computing NMDS plots, the result is quiet the same:
## Run 0 stress 0.2006672
## Run 1 stress 0.2006664
## ... New best solution
## ... procrustes: rmse 0.001004021 max resid 0.007828648
## *** Solution reached
## Warning in ordiplot(x, choices = choices, type = type, display = display, :
## Species scores not available
Working datasets:
dim(tb18_tax_occur_ss8522_no_cero)
## [1] 117 26713
tb18_tax_occur_ss8522_no_cero[1:5, 1:5]
## OTU_1 OTU_2 OTU_3 OTU_4 OTU_5
## st001 26 25 49 8 0
## st002 23 44 241 3 24
## st003 34 131 237 2 48
## st004 37 189 94 102 31
## st005 24 141 76 32 8
## NULL
tb18_tax_occur_ss8522_no_cero.bray<-as.matrix(tb18_tax_occur_ss8522_no_cero.bray)
dim(geo_distances_MP_18S)
## [1] 117 117
Communities quickly change their composition across geographical distances:
plot(geo_distances_MP_18S, tb18_tax_occur_ss8522_no_cero.bray, pch=19, cex=0.4, xlab="Geopgraphical distances", ylab="Bray-Curtis dissimilarities")
Mantel statistic is -significantlly- so low, meaning that the correlation between samples dissimilarity and geographical distances is weak.
mantel(geo_distances_MP_18S, tb18_tax_occur_ss8522_no_cero.bray)
##
## Mantel statistic based on Pearson's product-moment correlation
##
## Call:
## mantel(xdis = geo_distances_MP_18S, ydis = tb18_tax_occur_ss8522_no_cero.bray)
##
## Mantel statistic r: 0.1172
## Significance: 0.001
##
## Upper quantiles of permutations (null model):
## 90% 95% 97.5% 99%
## 0.0232 0.0296 0.0377 0.0447
## Permutation: free
## Number of permutations: 999
Maximum distance between samples:
## [1] 19500.19
Minimum distance between samples:
## [1] 0
Correlograms:
MP_18s_ss8522_mantel_correl_by_1000km<-mantel.correlog(tb18_tax_occur_ss8522_no_cero.bray, D.geo=geo_distances_MP_18S, break.pts=seq(0,20000, by=1000))
plot(MP_18s_ss8522_mantel_correl_by_1000km)
MP_18s_ss8522_mantel_correl_by_100km<-mantel.correlog(tb18_tax_occur_ss8522_no_cero.bray, D.geo=geo_distances_MP_18S, break.pts=seq(0,20000, by=100))
plot(MP_18s_ss8522_mantel_correl_by_100km)
In the following plot, we can appreciate the OTUs distribution according to their percentage of occurence and relative abundance. The red line keeps up OTUs that occur in more than 80% of the samples, the green line limits regionally rare OTUs (< 0.001%), and the blue one restricts regionally abundant OTUs (> 0.1%).
Regionally abundant OTUs (relative abundance over 0.1%):
## mean_rabund perc_occur
## OTU_2 0.048581148 100.00000
## OTU_6 0.035359462 98.29060
## OTU_23 0.012299990 89.74359
## OTU_8 0.011794511 93.16239
## OTU_25 0.010073475 97.43590
## OTU_27 0.009878906 99.14530
## OTU_1 0.008593144 100.00000
## OTU_12 0.008045541 99.14530
## OTU_9 0.007960292 64.10256
## OTU_7 0.007622303 97.43590
## OTU_14696 0.007580180 83.76068
## OTU_13 0.007133874 95.72650
## OTU_4 0.006976413 98.29060
## OTU_20 0.006860073 49.57265
## OTU_11 0.006806917 97.43590
## OTU_24 0.006582260 95.72650
## OTU_18 0.006405743 99.14530
## OTU_3 0.005555255 80.34188
## OTU_34 0.005196204 52.13675
## OTU_5 0.004919394 77.77778
## OTU_119 0.004804057 19.65812
## OTU_21 0.004780989 86.32479
## OTU_22 0.004643587 79.48718
## OTU_26 0.004545300 88.88889
## OTU_45 0.004444003 98.29060
## OTU_10 0.004252443 82.05128
## OTU_64 0.004210319 100.00000
## OTU_19 0.004174214 95.72650
## OTU_17 0.004143123 88.88889
## OTU_40 0.004006724 94.87179
## OTU_58 0.004004718 61.53846
## OTU_70 0.003977639 67.52137
## OTU_36 0.003915457 85.47009
## OTU_106 0.003816166 77.77778
## OTU_50 0.003762008 88.03419
## OTU_28 0.003607556 87.17949
## OTU_44 0.003544371 40.17094
## OTU_14 0.003438060 86.32479
## OTU_5618 0.003410981 88.03419
## OTU_55 0.003296646 97.43590
## OTU_31 0.003216411 93.16239
## OTU_42 0.003177297 92.30769
## OTU_15 0.003155232 39.31624
## OTU_81 0.003112106 90.59829
## OTU_38 0.003099068 85.47009
## OTU_32 0.003073995 64.95726
## OTU_39 0.003058950 99.14530
## OTU_51 0.003056945 84.61538
## OTU_30 0.003013818 56.41026
## OTU_47 0.002972698 62.39316
## OTU_80 0.002937595 58.11966
## OTU_33 0.002870399 94.01709
## OTU_767 0.002864381 24.78632
## OTU_35799 0.002829278 95.72650
## OTU_48 0.002829278 82.90598
## OTU_41 0.002816240 87.17949
## OTU_142 0.002807214 69.23077
## OTU_72 0.002806211 94.87179
## OTU_65 0.002804205 93.16239
## OTU_37 0.002761079 90.59829
## OTU_46 0.002755061 25.64103
## OTU_84 0.002724973 58.97436
## OTU_29 0.002699900 95.72650
## OTU_52 0.002675829 76.92308
## OTU_75 0.002624680 93.16239
## OTU_35 0.002588574 79.48718
## OTU_49 0.002573530 95.72650
## OTU_56 0.002547454 52.99145
## OTU_71 0.002504328 95.72650
## OTU_74 0.002477248 87.17949
## OTU_43 0.002406040 60.68376
## OTU_62 0.002397014 74.35897
## OTU_85 0.002348873 82.05128
## OTU_122 0.002346867 55.55556
## OTU_66 0.002274656 92.30769
## OTU_101 0.002262620 74.35897
## OTU_137 0.002255600 55.55556
## OTU_68 0.002226515 90.59829
## OTU_16 0.002212474 23.07692
## OTU_76 0.002211471 81.19658
## OTU_69 0.002208462 50.42735
## OTU_54 0.002168345 84.61538
## OTU_7889 0.002162327 66.66667
## OTU_102 0.002134245 68.37607
## OTU_11454 0.002108169 97.43590
## OTU_78 0.002087107 93.16239
## OTU_6315 0.002086104 82.90598
## OTU_73 0.002000855 64.95726
## OTU_92 0.001981799 66.66667
## OTU_82 0.001969764 76.92308
## OTU_89 0.001891535 57.26496
## OTU_87 0.001822332 82.90598
## OTU_96 0.001811300 58.97436
## OTU_57 0.001778203 86.32479
## OTU_126 0.001746109 94.01709
## OTU_124 0.001727053 73.50427
## OTU_53 0.001714015 24.78632
## OTU_109 0.001666877 54.70085
## OTU_59 0.001639798 70.08547
## OTU_3988 0.001636789 69.23077
## OTU_90 0.001628766 91.45299
## OTU_115 0.001620742 38.46154
## OTU_116 0.001612719 68.37607
## OTU_61 0.001593663 75.21368
## OTU_67 0.001590654 91.45299
## OTU_121 0.001562572 93.16239
## OTU_110 0.001542513 76.92308
## OTU_203 0.001542513 17.94872
## OTU_98 0.001535493 74.35897
## OTU_79 0.001509417 56.41026
## OTU_83 0.001469299 62.39316
## OTU_77 0.001456261 55.55556
## OTU_88 0.001451246 86.32479
## OTU_104 0.001437205 68.37607
## OTU_95 0.001428179 72.64957
## OTU_1882 0.001421158 59.82906
## OTU_112 0.001404108 84.61538
## OTU_103 0.001398091 78.63248
## OTU_99 0.001390067 59.82906
## OTU_63 0.001349950 41.02564
## OTU_108 0.001343932 60.68376
## OTU_243 0.001324877 94.01709
## OTU_1136 0.001319862 91.45299
## OTU_135 0.001294789 79.48718
## OTU_94 0.001286765 89.74359
## OTU_162 0.001285762 62.39316
## OTU_177 0.001268712 85.47009
## OTU_35494 0.001260689 58.97436
## OTU_111 0.001259686 88.03419
## OTU_128 0.001254671 77.77778
## OTU_91 0.001250659 63.24786
## OTU_174 0.001243639 54.70085
## OTU_188 0.001239627 91.45299
## OTU_146 0.001212548 61.53846
## OTU_192 0.001206530 82.90598
## OTU_120 0.001196501 60.68376
## OTU_125 0.001194495 86.32479
## OTU_113 0.001186472 63.24786
## OTU_132 0.001175439 45.29915
## OTU_105 0.001168419 39.31624
## OTU_131 0.001158389 89.74359
## OTU_145 0.001152372 74.35897
## OTU_93 0.001141340 73.50427
## OTU_165 0.001141340 68.37607
## OTU_152 0.001128301 67.52137
## OTU_157 0.001120278 29.05983
## OTU_166 0.001107240 57.26496
## OTU_141 0.001099216 55.55556
## OTU_11833 0.001077152 76.06838
## OTU_338 0.001069128 82.90598
## OTU_1819 0.001066119 48.71795
## OTU_180 0.001064114 57.26496
## OTU_133 0.001058096 50.42735
## OTU_163 0.001056090 61.53846
## OTU_140 0.001042049 71.79487
## OTU_151 0.001040043 57.26496
## OTU_7573 0.001031017 82.90598
## OTU_235 0.001030014 71.79487
## OTU_136 0.001023996 44.44444
## OTU_130 0.001015973 88.03419
## OTU_129 0.001015973 54.70085
## OTU_123 0.001009955 61.53846
dim(tb18_ss8522_abundant_sorted)
## [1] 162 2
Proportion of regionally abundant OTUs (%):
## [1] 0.6430868
Cosmopolitan OTUs (relative abundance over 0.1% and occurence in more than 80% of samples):
## mean_rabund perc_occur
## OTU_2 0.048581148 100.00000
## OTU_1 0.008593144 100.00000
## OTU_64 0.004210319 100.00000
## OTU_27 0.009878906 99.14530
## OTU_12 0.008045541 99.14530
## OTU_18 0.006405743 99.14530
## OTU_39 0.003058950 99.14530
## OTU_6 0.035359462 98.29060
## OTU_4 0.006976413 98.29060
## OTU_45 0.004444003 98.29060
## OTU_25 0.010073475 97.43590
## OTU_7 0.007622303 97.43590
## OTU_11 0.006806917 97.43590
## OTU_55 0.003296646 97.43590
## OTU_11454 0.002108169 97.43590
## OTU_13 0.007133874 95.72650
## OTU_24 0.006582260 95.72650
## OTU_19 0.004174214 95.72650
## OTU_35799 0.002829278 95.72650
## OTU_29 0.002699900 95.72650
## OTU_49 0.002573530 95.72650
## OTU_71 0.002504328 95.72650
## OTU_40 0.004006724 94.87179
## OTU_72 0.002806211 94.87179
## OTU_33 0.002870399 94.01709
## OTU_126 0.001746109 94.01709
## OTU_243 0.001324877 94.01709
## OTU_8 0.011794511 93.16239
## OTU_31 0.003216411 93.16239
## OTU_65 0.002804205 93.16239
## OTU_75 0.002624680 93.16239
## OTU_78 0.002087107 93.16239
## OTU_121 0.001562572 93.16239
## OTU_42 0.003177297 92.30769
## OTU_66 0.002274656 92.30769
## OTU_90 0.001628766 91.45299
## OTU_67 0.001590654 91.45299
## OTU_1136 0.001319862 91.45299
## OTU_188 0.001239627 91.45299
## OTU_81 0.003112106 90.59829
## OTU_37 0.002761079 90.59829
## OTU_68 0.002226515 90.59829
## OTU_23 0.012299990 89.74359
## OTU_94 0.001286765 89.74359
## OTU_131 0.001158389 89.74359
## OTU_26 0.004545300 88.88889
## OTU_17 0.004143123 88.88889
## OTU_50 0.003762008 88.03419
## OTU_5618 0.003410981 88.03419
## OTU_111 0.001259686 88.03419
## OTU_130 0.001015973 88.03419
## OTU_28 0.003607556 87.17949
## OTU_41 0.002816240 87.17949
## OTU_74 0.002477248 87.17949
## OTU_21 0.004780989 86.32479
## OTU_14 0.003438060 86.32479
## OTU_57 0.001778203 86.32479
## OTU_88 0.001451246 86.32479
## OTU_125 0.001194495 86.32479
## OTU_36 0.003915457 85.47009
## OTU_38 0.003099068 85.47009
## OTU_177 0.001268712 85.47009
## OTU_51 0.003056945 84.61538
## OTU_54 0.002168345 84.61538
## OTU_112 0.001404108 84.61538
## OTU_14696 0.007580180 83.76068
## OTU_48 0.002829278 82.90598
## OTU_6315 0.002086104 82.90598
## OTU_87 0.001822332 82.90598
## OTU_192 0.001206530 82.90598
## OTU_338 0.001069128 82.90598
## OTU_7573 0.001031017 82.90598
## OTU_10 0.004252443 82.05128
## OTU_85 0.002348873 82.05128
## OTU_76 0.002211471 81.19658
## OTU_3 0.005555255 80.34188
dim(otu_tb18_ss8522_cosmop_sorted)
## [1] 76 2
Proportion of cosmopolitan OTUs (%):
## [1] 0.254059
Number and proportion (%) of rare OTUs:
dim(otu_tb18_ss8522_rabund_percoccur[otu_tb18_ss8522_rabund_percoccur$mean_rabund < 0.00001 & otu_tb18_ss8522_rabund_percoccur$mean_rabund >0,])
## [1] 20730 2
## [1] 73.64138
Let’s add the taxonomic classification to the left OTUs by merging “tb18_tax_occur_ss8522_no_cero” with “tb18_tax”:
## [1] 26713 125
## Row.names st001 st002 st003 st004
## 1 OTU_1 26 23 34 37
## 2 OTU_10 2 9 38 40
## 3 OTU_100 13 30 4 6
## 4 OTU_1000 0 0 1 1
## 5 OTU_10000 0 0 0 1
## [1] 26713 124
## st001 st002 st003 st004 st005
## OTU_1 26 23 34 37 24
## OTU_10 2 9 38 40 11
## OTU_100 13 30 4 6 8
## OTU_1000 0 0 1 1 0
## OTU_10000 0 0 0 1 0
## [1] 26713 125
## st001 st002 st003 st004 st005
## OTU_1 26 23 34 37 24
## OTU_2 25 44 131 189 141
## OTU_3 49 241 237 94 76
## OTU_4 8 3 2 102 32
## OTU_5 0 24 48 31 8
## [1] 7727 125
## st001 st002 st003 st004 st005
## OTU_1 26 23 34 37 24
## OTU_4 8 3 2 102 32
## OTU_18 53 49 57 78 67
## OTU_25 113 106 119 147 205
## OTU_33 31 23 28 11 32
## st001 st002 st003 st004 st005
## OTU_1 26 23 34 37 24
## OTU_4 8 3 2 102 32
## OTU_18 53 49 57 78 67
## OTU_25 113 106 119 147 205
## OTU_33 31 23 28 11 32
## [1] 6451 117
## [1] 117
## [1] 117
## [1] 117
## [1] 111
## [1] 116
## [1] 95
## [1] 90
## [1] 97
## [1] 95
## [1] 89
## [1] 47
## [1] 32
## [1] 11
## [1] 25
## [1] 19
## [1] 27
## [1] 6
## Group.1 x
## 8 Dinophyceae 137924
## 13 Prasinophyceae 20175
## 5 Chrysophyceae 16440
## 11 Pelagophyceae 9270
## 7 Dictyochophyceae 8084
## 6 Cryptomonadales 2110
## 1 Bacillariophyta 765
## 3 Chlorarachniophyta 728
## 2 Bolidophyceae 573
## 12 Pinguiochysidales 301
## 14 Prymnesiophyceae 100
## 10 Mamiellophyceae 77
## 15 Raphydophyceae 41
## 4 Chlorophyceae 40
## 9 Eustigmatales 40
## 17 Ulvophyceae 30
## 16 Trebouxiophyceae 25
## Group.1 x
## 8 Dinophyceae 6451
## 13 Prasinophyceae 300
## 5 Chrysophyceae 257
## 11 Pelagophyceae 225
## 7 Dictyochophyceae 172
## 1 Bacillariophyta 68
## 3 Chlorarachniophyta 57
## 6 Cryptomonadales 55
## 14 Prymnesiophyceae 37
## 10 Mamiellophyceae 26
## 4 Chlorophyceae 24
## 2 Bolidophyceae 15
## 12 Pinguiochysidales 13
## 17 Ulvophyceae 13
## 9 Eustigmatales 6
## 16 Trebouxiophyceae 5
## 15 Raphydophyceae 3
## reads_per_class OTUs_per_class
## Bacillariophyta 765 68
## Bolidophyceae 573 15
## Chlorarachniophyta 728 57
## Chlorophyceae 40 24
## Chrysophyceae 16440 257
## reads_per_class OTUs_per_class samples_per_class
## Dinophyceae 137924 6451 117
## Prasinophyceae 20175 300 117
## Chrysophyceae 16440 257 117
## Pelagophyceae 9270 225 111
## Dictyochophyceae 8084 172 116
## Cryptomonadales 2110 55 95
## Bacillariophyta 765 68 90
## Chlorarachniophyta 728 57 97
## Bolidophyceae 573 15 95
## Pinguiochysidales 301 13 89
## Prymnesiophyceae 100 37 47
## Mamiellophyceae 77 26 32
## Raphydophyceae 41 3 27
## Chlorophyceae 40 24 25
## Eustigmatales 40 6 11
## Ulvophyceae 30 13 19
## Trebouxiophyceae 25 5 6
OTUs per class vs. samples in which they occur [PLOT DESCRIPTION]
Reads per class vs. OTUs per class [PLOT DESCRIPTION]
## Warning in trans$transform(limits): NaNs produced
## Warning in trans$transform(limits): NaNs produced
## Warning in trans$transform(limits): NaNs produced
## reads_per_class OTUs_per_class samples_per_class
## 100.000 100.000 1035.043
## reads_per_class OTUs_per_class samples_per_class
## Dinophyceae 70.11076488 83.48647599 100.000000
## Prasinophyceae 10.25553697 3.88248997 100.000000
## Chrysophyceae 8.35692827 3.32599974 100.000000
## Pelagophyceae 4.71220955 2.91186748 94.871795
## Dictyochophyceae 4.10933139 2.22596092 99.145299
## Cryptomonadales 1.07257413 0.71178983 81.196581
## Bacillariophyta 0.38887166 0.88003106 76.923077
## Chlorarachniophyta 0.37006349 0.73767309 82.905983
## Bolidophyceae 0.29127250 0.19412450 81.196581
## Pinguiochysidales 0.15300702 0.16824123 76.068376
## Prymnesiophyceae 0.05083290 0.47884043 40.170940
## Mamiellophyceae 0.03914133 0.33648246 27.350427
## Raphydophyceae 0.02084149 0.03882490 23.076923
## Chlorophyceae 0.02033316 0.31059920 21.367521
## Eustigmatales 0.02033316 0.07764980 9.401709
## Ulvophyceae 0.01524987 0.16824123 16.239316
## Trebouxiophyceae 0.01270822 0.06470817 5.128205
OTUs per class vs. samples in which they occur [PLOT DESCRIPTION]
## Warning in trans$transform(value): NaNs produced
Reads per class vs. OTUs per class [PLOT DESCRIPTION]
## Warning in trans$transform(limits): NaNs produced
## Warning in trans$transform(limits): NaNs produced
## Warning in trans$transform(value): NaNs produced
## Warning in trans$transform(limits): NaNs produced
## Warning in trans$transform(limits): NaNs produced
## Warning in trans$transform(limits): NaNs produced
## [1] 47144 125
## Row.names st001 st002 st003 st004
## 1 OTU_1 469 187 240 135
## 2 OTU_10 10 72 173 176
## 3 OTU_100 264 325 25 28
## 4 OTU_1000 4 1 8 5
## 5 OTU_10000 4 6 5 2
## [1] 47144 124
## st001 st002 st003 st004 st005
## OTU_1 469 187 240 135 184
## OTU_10 10 72 173 176 84
## OTU_100 264 325 25 28 58
## OTU_1000 4 1 8 5 0
## OTU_10000 4 6 5 2 0
## [1] 47144 125
## st001 st002 st003 st004 st005
## OTU_1 469 187 240 135 184
## OTU_2 445 389 760 819 1003
## OTU_3 1030 2196 1611 453 512
## OTU_4 135 66 36 365 146
## OTU_5 0 178 285 175 49
## [1] 14416 125
## st001 st002 st003 st004 st005
## OTU_1 469 187 240 135 184
## OTU_4 135 66 36 365 146
## OTU_18 908 383 349 349 508
## OTU_25 2214 1138 716 606 1386
## OTU_33 496 166 141 60 242
## st001 st002 st003 st004 st005
## OTU_1 469 187 240 135 184
## OTU_4 135 66 36 365 146
## OTU_18 908 383 349 349 508
## OTU_25 2214 1138 716 606 1386
## OTU_33 496 166 141 60 242
## [1] 12117 117
## [1] 117
## [1] 117
## [1] 117
## [1] 117
## [1] 113
## [1] 116
## [1] 103
## [1] 112
## [1] 112
## [1] 108
## [1] 111
## [1] 97
## [1] 65
## [1] 35
## [1] 75
## [1] 50
## [1] 75
## [1] 18
## Group.1 x
## 8 Dinophyceae 2057121
## 13 Prasinophyceae 246726
## 5 Chrysophyceae 217440
## 11 Pelagophyceae 142764
## 7 Dictyochophyceae 118388
## 6 Cryptomonadales 30359
## 3 Chlorarachniophyta 11722
## 1 Bacillariophyta 10858
## 2 Bolidophyceae 8322
## 12 Pinguiochysidales 5055
## 14 Prymnesiophyceae 1236
## 10 Mamiellophyceae 1170
## 9 Eustigmatales 741
## 15 Raphydophyceae 716
## 4 Chlorophyceae 669
## 17 Ulvophyceae 442
## 16 Trebouxiophyceae 247
## Group.1 x
## 8 Dinophyceae 12117
## 13 Prasinophyceae 523
## 11 Pelagophyceae 458
## 5 Chrysophyceae 431
## 7 Dictyochophyceae 276
## 1 Bacillariophyta 137
## 6 Cryptomonadales 105
## 3 Chlorarachniophyta 91
## 14 Prymnesiophyceae 79
## 4 Chlorophyceae 55
## 10 Mamiellophyceae 44
## 17 Ulvophyceae 26
## 12 Pinguiochysidales 25
## 2 Bolidophyceae 19
## 16 Trebouxiophyceae 14
## 9 Eustigmatales 11
## 15 Raphydophyceae 5
## reads_per_class OTUs_per_class
## Bacillariophyta 10858 137
## Bolidophyceae 8322 19
## Chlorarachniophyta 11722 91
## Chlorophyceae 669 55
## Chrysophyceae 217440 431
## reads_per_class OTUs_per_class samples_per_class
## Dinophyceae 2057121 12117 117
## Prasinophyceae 246726 523 117
## Chrysophyceae 217440 431 117
## Pelagophyceae 142764 458 113
## Dictyochophyceae 118388 276 116
## Cryptomonadales 30359 105 103
## Chlorarachniophyta 11722 91 112
## Bacillariophyta 10858 137 112
## Bolidophyceae 8322 19 108
## Pinguiochysidales 5055 25 111
## Prymnesiophyceae 1236 79 97
## Mamiellophyceae 1170 44 65
## Eustigmatales 741 11 35
## Raphydophyceae 716 5 75
## Chlorophyceae 669 55 75
## Ulvophyceae 442 26 50
## Trebouxiophyceae 247 14 18
## reads_per_class OTUs_per_class samples_per_class
## 100.000 100.000 1317.094
## reads_per_class OTUs_per_class samples_per_class
## Dinophyceae 72.079127505 84.05244173 100.00000
## Prasinophyceae 8.644992109 3.62791343 100.00000
## Chrysophyceae 7.618844727 2.98973363 100.00000
## Pelagophyceae 5.002284532 3.17702553 96.58120
## Dictyochophyceae 4.148177840 1.91453940 99.14530
## Cryptomonadales 1.063744054 0.72835738 88.03419
## Chlorarachniophyta 0.410725248 0.63124306 95.72650
## Bacillariophyta 0.380451693 0.95033296 95.72650
## Bolidophyceae 0.291593202 0.13179800 92.30769
## Pinguiochysidales 0.177121321 0.17341842 94.87179
## Prymnesiophyceae 0.043308003 0.54800222 82.90598
## Mamiellophyceae 0.040995439 0.30521643 55.55556
## Eustigmatales 0.025963778 0.07630411 29.91453
## Raphydophyceae 0.025087807 0.03468368 64.10256
## Chlorophyceae 0.023440982 0.38152053 64.10256
## Ulvophyceae 0.015487166 0.18035516 42.73504
## Trebouxiophyceae 0.008654593 0.09711432 15.38462
OTUs per class vs. samples in which they occur [PLOT DESCRIPTION]
Reads per class vs. OTUs per class [PLOT DESCRIPTION]
## Warning in trans$transform(limits): NaNs produced
Let’s read the dataset and remove the samples containing less than 5836 reads:
## [1] 960 129
## OTUId st001 st002 st003 st004
## 1 OTU_1 72073 45381 40792 26317
## 2 OTU_2 11925 13467 6500 4802
## 3 OTU_3 19 12 10 21
## 4 OTU_4 1795 2861 1057 993
## 5 OTU_5 1766 1341 463 499
## Warning in `[<-.factor`(`*tmp*`, thisvar, value = 0): invalid factor level,
## NA generated
## Warning in `[<-.factor`(`*tmp*`, thisvar, value = 0): invalid factor level,
## NA generated
## Warning in `[<-.factor`(`*tmp*`, thisvar, value = 0): invalid factor level,
## NA generated
## Warning in `[<-.factor`(`*tmp*`, thisvar, value = 0): invalid factor level,
## NA generated
## [1] 960 128
## st001 st002 st003 st004 st005
## OTU_1 72073 45381 40792 26317 65068
## OTU_2 11925 13467 6500 4802 10947
## OTU_3 19 12 10 21 4
## OTU_4 1795 2861 1057 993 2466
## OTU_5 1766 1341 463 499 1912
## [1] 960 6
## SILVA_taxonomic_classification
## OTU_1 KC002097.1.1314_Bacteria;Cyanobacteria;Cyanobacteria;SubsectionI;FamilyI;Prochlorococcus;unidentified_marine_bacterioplankton
## OTU_2 EU800388.1.1465_Bacteria;Proteobacteria;Alphaproteobacteria;SAR11_clade;Surface_1;uncultured_bacterium
## OTU_3 KJ590614.1.1421_Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Sulfitobacter;uncultured_bacterium
## OTU_4 EU805005.1.1343_Bacteria;Actinobacteria;Acidimicrobiia;Acidimicrobiales;OM1_clade;Candidatus_Actinomarina;uncultured_bacterium
## OTU_5 KC000519.1.1314_Bacteria;Cyanobacteria;Cyanobacteria;SubsectionI;FamilyI;Synechococcus;unidentified_marine_bacterioplankton
## Phytoref_taxonomic_classification
## OTU_1 EU803052.1.1425|Bacteria|Cyanobacteria|Cyanobacteria|SubsectionI|FamilyI|Prochlorococcus|uncultured|uncultured+bacterium
## OTU_2 <NA>
## OTU_3 <NA>
## OTU_4 <NA>
## OTU_5 AY663935.1.1244|Bacteria|Cyanobacteria|Cyanobacteria|SubsectionI|FamilyI|Synechococcus|uncultured|uncultured+Synechococcus
## SILVA_funct_classif Phytoref_funct_classif class_A
## OTU_1 cyanob cyanob Cyanobacteria
## OTU_2 bacteria <NA> heterotrophic_bacteria
## OTU_3 bacteria <NA> heterotrophic_bacteria
## OTU_4 bacteria <NA> heterotrophic_bacteria
## OTU_5 cyanob cyanob Cyanobacteria
## class_B
## OTU_1 Prochlorococcus
## OTU_2 heterotrophic_bacteria
## OTU_3 heterotrophic_bacteria
## OTU_4 heterotrophic_bacteria
## OTU_5 Synechococcus
## [1] 960 122
## st001 st002 st003 st004 st005
## OTU_1 72073 45381 40792 26317 65068
## OTU_2 11925 13467 6500 4802 10947
## OTU_3 19 12 10 21 4
## OTU_4 1795 2861 1057 993 2466
## OTU_5 1766 1341 463 499 1912
## st122 st124 st137 st144
## 2727 3408 420 624
## [1] 960 118
## [1] 960 117
## [1] 960 0
Table dimensions and content outline:
## [1] 960 117
## st001 st002 st003 st004 st005
## OTU_1 72073 45381 40792 26317 65068
## OTU_2 11925 13467 6500 4802 10947
## OTU_3 19 12 10 21 4
## OTU_4 1795 2861 1057 993 2466
## OTU_5 1766 1341 463 499 1912
Minimum number of reads per station:
min(colSums(tb16_tax_occur_min5836))
## [1] 5836
Maximum number of reads per station:
max(colSums(tb16_tax_occur_min5836))
## [1] 156024
Identification of station with higher number of reads:
amplicons_per_sample<-colSums(tb16_tax_occur_min5836)
amplicons_per_sample[which(colSums(tb16_tax_occur_min5836)>150000)]
## st050
## 156024
Overall reads per sample:
Let’s normalize the original dataset by randomly subsampling 5836 reads in each station:
library(vegan)
tb16_tax_occur_min5836_t<-t(tb16_tax_occur_min5836)
tb16_tax_occur_ss5836<-rrarefy(tb16_tax_occur_min5836_t, 5836)
The normalized table shows the following dimensions and format:
## [1] 117 960
## OTU_1 OTU_2 OTU_3 OTU_4 OTU_5
## st001 4063 680 0 91 90
## st002 3473 954 1 214 87
## st003 4334 688 0 86 53
## st004 3881 706 3 155 84
## st005 4139 773 0 140 118
Its content fits with the expected normalization values (5836 reads per station):
rowSums(tb16_tax_occur_ss5836)
## st001 st002 st003 st004 st005 st007 st009 st010 st012 st013 st014 st015
## 5836 5836 5836 5836 5836 5836 5836 5836 5836 5836 5836 5836
## st017 st018 st019 st022 st023 st024 st025 st026 st027 st028 st029 st030
## 5836 5836 5836 5836 5836 5836 5836 5836 5836 5836 5836 5836
## st032 st033 st034 st035 st037 st038 st039 st040 st041 st043 st044 st045
## 5836 5836 5836 5836 5836 5836 5836 5836 5836 5836 5836 5836
## st046 st047 st049 st050 st051 st052 st053 st055 st056 st057 st058 st059
## 5836 5836 5836 5836 5836 5836 5836 5836 5836 5836 5836 5836
## st060 st062 st063 st064 st065 st066 st067 st068 st069 st070 st071 st072
## 5836 5836 5836 5836 5836 5836 5836 5836 5836 5836 5836 5836
## st073 st074 st075 st076 st077 st078 st079 st081 st082 st083 st085 st086
## 5836 5836 5836 5836 5836 5836 5836 5836 5836 5836 5836 5836
## st087 st088 st089 st092 st093 st094 st095 st096 st097 st098 st101 st102
## 5836 5836 5836 5836 5836 5836 5836 5836 5836 5836 5836 5836
## st103 st104 st106 st107 st108 st109 st110 st112 st114 st115 st116 st118
## 5836 5836 5836 5836 5836 5836 5836 5836 5836 5836 5836 5836
## st119 st120 st121 st125 st126 st127 st128 st129 st130 st131 st132 st133
## 5836 5836 5836 5836 5836 5836 5836 5836 5836 5836 5836 5836
## st134 st135 st138 st139 st140 st141 st143 st145 st146
## 5836 5836 5836 5836 5836 5836 5836 5836 5836
Let’s check out how many OTUs don’t appear in the new table:
length(which(colSums(tb16_tax_occur_ss5836)==0))
## [1] 39
There are 54 OTUs that don’t show any occurrence in the normalized data. Let’s remove them from the table and take a look at its final dimensions:
tb16_tax_occur_ss5836_no_cero<-tb16_tax_occur_ss5836[,-(which(colSums(tb16_tax_occur_ss5836)==0))]
dim(tb16_tax_occur_ss5836_no_cero)
## [1] 117 921
Datasets summary: dim(tb16_tax) –> 993 128 dim(tb16_tax_occur) –> 960 122 dim(tb16_tax_occur_ss5836_no_cero) –> 117 909
Most of the samples take Shannon Index values between 5.5 and 6:
Lowest number of OTUs per sample:
## [1] 136
Maximum number of OTUs per sample:
## [1] 356
In most of the samples, we can identify between 1000 and 1300 OTUs:
plot(sort(OTUs_per_sample_16S_tax_occur_ss5836), pch=19)
boxplot(OTUs_per_sample_16S_tax_occur_ss5836, pch=19)
pielou_evenness_16S_tax_occur_ss5836 <- tb16_tax_occur_ss5836_div/log(OTUs_per_sample_16S_tax_occur_ss5836)
The Pielou index (constrained between 0 and 1) takes values closer to 1 as the variation of species proportion in a sample decreases. Most of the samples get values between 0.8 and 0.9, meaning that the numerical composition of different OTUs in a sample is highly similar:
plot(sort(pielou_evenness_16S_tax_occur_ss5836), pch=19)
boxplot(pielou_evenness_16S_tax_occur_ss5836, pch=19)
The OTU_2, with 27292 reads, is the most abundant in the overall dataset:
head(sort(colSums(tb16_tax_occur_ss5836_no_cero), decreasing=T), n=10L)
## OTU_1 OTU_2 OTU_4 OTU_5 OTU_3 OTU_6 OTU_552 OTU_10 OTU_23
## 277010 97628 34553 26999 12085 11019 7175 6737 6383
## OTU_435
## 6063
Most of the OTUs show very few occurrences; the plot suggests that we will probably be able to identify a significant ammount of rare otus:
plot(log(sort(colSums(tb16_tax_occur_ss5836_no_cero), decreasing=T)), pch=19)
The OTUs abundance distribution fits relativelly close to log-normal model.
According to Preston’s lognormal model fit into species frequencies groups, we’re missing ~3226 species:
tb16_tax_occur_ss5836_prestonfit<-prestonfit(colSums(tb16_tax_occur_min5836_t))
plot(tb16_tax_occur_ss5836_prestonfit, main="Pooled species")
veiledspec(tb16_tax_occur_ss5836_prestonfit)
## Extrapolated Observed Veiled
## 972.72511 960.00000 12.72511
When computing Prestons’ lognormal model fit without pooling data into groups, we seem to miss ~3055 species:
## Extrapolated Observed Veiled
## 964.490456 960.000000 4.490456
The Bray-Curtis dissimilarity, constrained between 0 (minimum distance) and 1 (highest dissimilarity) allows us to quantify the differences between samples according to the composition and relative abundance of their OTUs. In our dataset, most of the samples pairs take dissimilarity values between between 0.7 and 0.8, meaning that their composition is substantially different.
The only relatively evident cluster we can distinguish in the dendogram stands out in the very left side of the plot.
(To be done: assign Longhurst provinces information to each station and check if any of the central clusters is meaningful regarding to the samples’ geographical ubication)
We can identify a prominent group in the central part of the NMDS plot and a few outliers in the middle-left edge of the plot. The stress parameter takes a value below 0.3, suggesting that the plot is acceptable.
##
## Call:
## monoMDS(dist = tb16_tax_occur_ss5836_no_cero.bray)
##
## Non-metric Multidimensional Scaling
##
## 117 points, dissimilarity 'bray', call 'vegdist(x = tb16_tax_occur_ss5836_no_cero, method = "bray")'
##
## Dimensions: 2
## Stress: 0.1330533
## Stress type 1, weak ties
## Scores scaled to unit root mean square, rotated to principal components
## Stopped after 118 iterations: Stress nearly unchanged (ratio > sratmax)
When implementing a most robut function for computing NMDS plots, the result is quiet the same:
## Run 0 stress 0.1248949
## Run 1 stress 0.1258697
## Run 2 stress 0.1245297
## ... New best solution
## ... procrustes: rmse 0.006778966 max resid 0.05528081
## Run 3 stress 0.1277239
## Run 4 stress 0.1255918
## Run 5 stress 0.1278339
## Run 6 stress 0.126939
## Run 7 stress 0.1258749
## Run 8 stress 0.1297871
## Run 9 stress 0.1245368
## ... procrustes: rmse 0.0009401818 max resid 0.007321792
## *** Solution reached
## Warning in ordiplot(x, choices = choices, type = type, display = display, :
## Species scores not available
Working datasets:
dim(tb16_tax_occur_ss5836_no_cero)
## [1] 117 921
tb16_tax_occur_ss5836_no_cero[1:5, 1:5]
## OTU_1 OTU_2 OTU_3 OTU_4 OTU_5
## st001 4063 680 0 91 90
## st002 3473 954 1 214 87
## st003 4334 688 0 86 53
## st004 3881 706 3 155 84
## st005 4139 773 0 140 118
## NULL
tb16_tax_occur_ss5836_no_cero.bray<-as.matrix(tb16_tax_occur_ss5836_no_cero.bray)
dim(geo_distances_MP_16S)
## [1] 117 117
Communities quickly change their composition across geographical distances:
plot(geo_distances_MP_16S, tb16_tax_occur_ss5836_no_cero.bray, pch=19, cex=0.4, xlab="Geopgraphical distances", ylab="Bray-Curtis dissimilarities")
Mantel statistic is -significantlly- so low, meaning that the correlation between samples dissimilarity and geographical distances is weak.
mantel(geo_distances_MP_16S, tb16_tax_occur_ss5836_no_cero.bray)
##
## Mantel statistic based on Pearson's product-moment correlation
##
## Call:
## mantel(xdis = geo_distances_MP_16S, ydis = tb16_tax_occur_ss5836_no_cero.bray)
##
## Mantel statistic r: 0.01288
## Significance: 0.199
##
## Upper quantiles of permutations (null model):
## 90% 95% 97.5% 99%
## 0.0230 0.0303 0.0395 0.0543
## Permutation: free
## Number of permutations: 999
Maximum distance between samples:
## [1] 19500.19
Minimum distance between samples:
## [1] 0
Correlograms:
MP_16s_ss5836_mantel_correl_by_1000km<-mantel.correlog(tb16_tax_occur_ss5836_no_cero.bray, D.geo=geo_distances_MP_16S, break.pts=seq(0,20000, by=1000))
plot(MP_16s_ss5836_mantel_correl_by_1000km)
MP_16s_ss5836_mantel_correl_by_100km<-mantel.correlog(tb16_tax_occur_ss5836_no_cero.bray, D.geo=geo_distances_MP_16S, break.pts=seq(0,20000, by=100))
plot(MP_16s_ss5836_mantel_correl_by_100km)
mantel(tb18_tax_occur_ss8522_no_cero.bray, tb16_tax_occur_ss5836_no_cero.bray)
##
## Mantel statistic based on Pearson's product-moment correlation
##
## Call:
## mantel(xdis = tb18_tax_occur_ss8522_no_cero.bray, ydis = tb16_tax_occur_ss5836_no_cero.bray)
##
## Mantel statistic r: 0.6253
## Significance: 0.001
##
## Upper quantiles of permutations (null model):
## 90% 95% 97.5% 99%
## 0.0808 0.1084 0.1300 0.1689
## Permutation: free
## Number of permutations: 999
In the following plot, we can appreciate the OTUs distribution according to their percentage of occurence and relative abundance. The red line keeps up OTUs that occur in more than 80% of the samples, the green line limits regionally rare OTUs (< 0.001%), and the blue one restricts regionally abundant OTUs (> 0.1%).
Regionally abundant OTUs (relative abundance over 0.1%):
## mean_rabund perc_occur
## OTU_1 0.405690000 100.00000
## OTU_2 0.142979327 100.00000
## OTU_4 0.050603973 100.00000
## OTU_5 0.039540899 100.00000
## OTU_3 0.017698869 92.30769
## OTU_6 0.016137678 100.00000
## OTU_552 0.010508017 99.14530
## OTU_10 0.009866552 100.00000
## OTU_23 0.009348108 99.14530
## OTU_435 0.008879457 100.00000
## OTU_104 0.008807695 100.00000
## OTU_12 0.007561379 100.00000
## OTU_9 0.007381241 100.00000
## OTU_13 0.006978495 100.00000
## OTU_28 0.006380966 100.00000
## OTU_20 0.006092453 94.01709
## OTU_7 0.005767327 91.45299
## OTU_34 0.005027738 94.87179
## OTU_198 0.004707006 100.00000
## OTU_11 0.004608882 99.14530
## OTU_8 0.004598630 98.29060
## OTU_19 0.004204671 100.00000
## OTU_368 0.004172452 94.87179
## OTU_234 0.004042108 99.14530
## OTU_25 0.003926410 100.00000
## OTU_33 0.003450437 96.58120
## OTU_14 0.003391856 99.14530
## OTU_37 0.003296661 98.29060
## OTU_40 0.003242474 76.06838
## OTU_21 0.003226364 99.14530
## OTU_29 0.003060872 99.14530
## OTU_15 0.002930528 93.16239
## OTU_24 0.002756249 83.76068
## OTU_45 0.002631764 100.00000
## OTU_16 0.002501421 75.21368
## OTU_709 0.002489704 90.59829
## OTU_27 0.002316890 82.90598
## OTU_102 0.002255379 98.29060
## OTU_36 0.002250986 94.01709
## OTU_39 0.002195333 99.14530
## OTU_70 0.002168972 100.00000
## OTU_724 0.002139681 100.00000
## OTU_240 0.002034235 82.90598
## OTU_139 0.002019590 94.87179
## OTU_166 0.001999086 96.58120
## OTU_35 0.001990299 85.47009
## OTU_44 0.001985905 93.16239
## OTU_18 0.001980047 95.72650
## OTU_17 0.001947828 65.81197
## OTU_163 0.001884853 97.43590
## OTU_654 0.001873136 94.01709
## OTU_22 0.001829200 75.21368
## OTU_694 0.001796981 91.45299
## OTU_55 0.001725219 89.74359
## OTU_90 0.001675425 99.14530
## OTU_177 0.001675425 86.32479
## OTU_26 0.001646134 71.79487
## OTU_32 0.001635882 81.19658
## OTU_63 0.001572907 78.63248
## OTU_43 0.001562656 66.66667
## OTU_41 0.001530436 92.30769
## OTU_49 0.001410344 63.24786
## OTU_451 0.001401557 93.16239
## OTU_247 0.001400093 97.43590
## OTU_56 0.001392770 88.03419
## OTU_62 0.001378125 90.59829
## OTU_42 0.001353228 98.29060
## OTU_31 0.001241923 40.17094
## OTU_93 0.001225813 93.16239
## OTU_84 0.001221420 91.45299
## OTU_98 0.001211168 88.03419
## OTU_61 0.001177484 14.52991
## OTU_46 0.001155516 64.95726
## OTU_52 0.001124761 51.28205
## OTU_404 0.001121831 98.29060
## OTU_89 0.001121831 94.87179
## OTU_48 0.001076431 79.48718
## OTU_147 0.001074966 95.72650
## OTU_103 0.001058857 77.77778
## OTU_74 0.001017850 64.10256
## OTU_67 0.001017850 47.00855
## OTU_66 0.001016385 66.66667
## OTU_59 0.001006133 35.04274
dim(tb16_ss5836_abundant_sorted)
## [1] 83 2
Proportion of regionally abundant OTUs (%):
## [1] 0.6430868
Cosmopolitan OTUs (relative abundance over 0.1% and occurence in more than 80% of samples):
## mean_rabund perc_occur
## OTU_1 0.405690000 100.00000
## OTU_2 0.142979327 100.00000
## OTU_4 0.050603973 100.00000
## OTU_5 0.039540899 100.00000
## OTU_6 0.016137678 100.00000
## OTU_10 0.009866552 100.00000
## OTU_435 0.008879457 100.00000
## OTU_104 0.008807695 100.00000
## OTU_12 0.007561379 100.00000
## OTU_9 0.007381241 100.00000
## OTU_13 0.006978495 100.00000
## OTU_28 0.006380966 100.00000
## OTU_198 0.004707006 100.00000
## OTU_19 0.004204671 100.00000
## OTU_25 0.003926410 100.00000
## OTU_45 0.002631764 100.00000
## OTU_70 0.002168972 100.00000
## OTU_724 0.002139681 100.00000
## OTU_552 0.010508017 99.14530
## OTU_23 0.009348108 99.14530
## OTU_11 0.004608882 99.14530
## OTU_234 0.004042108 99.14530
## OTU_14 0.003391856 99.14530
## OTU_21 0.003226364 99.14530
## OTU_29 0.003060872 99.14530
## OTU_39 0.002195333 99.14530
## OTU_90 0.001675425 99.14530
## OTU_8 0.004598630 98.29060
## OTU_37 0.003296661 98.29060
## OTU_102 0.002255379 98.29060
## OTU_42 0.001353228 98.29060
## OTU_404 0.001121831 98.29060
## OTU_163 0.001884853 97.43590
## OTU_247 0.001400093 97.43590
## OTU_33 0.003450437 96.58120
## OTU_166 0.001999086 96.58120
## OTU_18 0.001980047 95.72650
## OTU_147 0.001074966 95.72650
## OTU_34 0.005027738 94.87179
## OTU_368 0.004172452 94.87179
## OTU_139 0.002019590 94.87179
## OTU_89 0.001121831 94.87179
## OTU_20 0.006092453 94.01709
## OTU_36 0.002250986 94.01709
## OTU_654 0.001873136 94.01709
## OTU_15 0.002930528 93.16239
## OTU_44 0.001985905 93.16239
## OTU_451 0.001401557 93.16239
## OTU_93 0.001225813 93.16239
## OTU_3 0.017698869 92.30769
## OTU_41 0.001530436 92.30769
## OTU_7 0.005767327 91.45299
## OTU_694 0.001796981 91.45299
## OTU_84 0.001221420 91.45299
## OTU_709 0.002489704 90.59829
## OTU_62 0.001378125 90.59829
## OTU_55 0.001725219 89.74359
## OTU_56 0.001392770 88.03419
## OTU_98 0.001211168 88.03419
## OTU_177 0.001675425 86.32479
## OTU_35 0.001990299 85.47009
## OTU_24 0.002756249 83.76068
## OTU_27 0.002316890 82.90598
## OTU_240 0.002034235 82.90598
## OTU_32 0.001635882 81.19658
dim(otu_tb16_ss5836_cosmop_sorted)
## [1] 65 2
Proportion of cosmopolitan OTUs (%):
## [1] 0.254059
Number and proportion (%) of rare OTUs:
dim(otu_tb16_ss5836_rabund_percoccur[otu_tb16_ss5836_rabund_percoccur$mean_rabund < 0.00001 & otu_tb16_ss5836_rabund_percoccur$mean_rabund >0,])
## [1] 217 2
## [1] 65.70204
Let’s add the taxonomic classification to the left OTUs by merging “tb16_tax_occur_ss5836_no_cero” with “tb16_tax”.
## [1] 921 124
## Row.names st001 st002 st003 st004
## 1 OTU_1 4063 3473 4334 3881
## 2 OTU_10 16 36 14 21
## 3 OTU_100 0 0 0 0
## 4 OTU_1001 0 0 0 0
## 5 OTU_1004 0 0 0 0
## [1] 921 123
## st001 st002 st003 st004 st005
## OTU_1 4063 3473 4334 3881 4139
## OTU_10 16 36 14 21 11
## OTU_100 0 0 0 0 0
## OTU_1001 0 0 0 0 0
## OTU_1004 0 0 0 0 0
## [1] 921 124
## class_A class_B OTU_no
## OTU_1 Cyanobacteria Prochlorococcus 1
## OTU_2 heterotrophic_bacteria heterotrophic_bacteria 2
## OTU_3 heterotrophic_bacteria heterotrophic_bacteria 3
## OTU_4 heterotrophic_bacteria heterotrophic_bacteria 4
## OTU_5 Cyanobacteria Synechococcus 5
## [1] 87 124
## [1] 834 124
## st146
## OTU_54 3
## OTU_57 17
## OTU_58 1
## OTU_80 0
## OTU_114 0
## SILVA_taxonomic_classification
## OTU_54 EU237456.1.1310_Bacteria;Cyanobacteria;Chloroplast;uncultured_bacterium
## OTU_57 AY664138.1.1204_Bacteria;Cyanobacteria;Chloroplast;uncultured_phototrophic_eukaryote
## OTU_58 <NA>
## OTU_80 HM565911.2962.4423_Bacteria;Cyanobacteria;Chloroplast;uncultured_prymnesiophyte_C8704
## OTU_114 JN986350.1.1460_Bacteria;Cyanobacteria;Chloroplast;uncultured_bacterium
## Phytoref_taxonomic_classification
## OTU_54 5523_AB847984|Eukaryota|Hacrobia|Haptophyta|Prymnesiophyceae|Prymnesiophyceae_X|Prymnesiophyceae_XX|Prymnesiophyceae_XXX|Braarudosphaeraceae|Braarudosphaera|Braarudosphaera_bigelowii
## OTU_57 474_Eukaryota|Stramenopiles|Ochrophyta|Dictyochophyceae|Dictyochophyceae_X|Dictyochophyceae_XX|Dictyochophyceae_XXX|Dictyochophyceae_XXXX|Helicopedinella|Helicopedinella_sp
## OTU_58 432_DQ438491|Eukaryota|Archaeplastida|Chlorophyta|Prasinophyceae|Prasinophyceae_X|Prasino-clade-9|Prasino-clade-9_X|Prasino-clade-9_XX|Prasino-clade-9_XXX|Prasino-clade-9_XXX_sp
## OTU_80 <NA>
## OTU_114 891_FJ649309|Eukaryota|Archaeplastida|Chlorophyta|Prasinophyceae|Prasinophyceae_X|Prasino-clade-7|Prasino-clade-7_X|Prasino-clade-7_X-B|Prasino-clade-7_X-B1|Prasino-clade-7_X-B1_sp
## SILVA_funct_classif Phytoref_funct_classif class_A
## OTU_54 plastid plastid Prymnesiophyceae
## OTU_57 plastid plastid Dictyochophyceae
## OTU_58 <NA> plastid Prasinophyceae
## OTU_80 plastid <NA> other_plastids
## OTU_114 plastid plastid Prasinophyceae
## class_B OTU_no
## OTU_54 Prymnesiophyceae 54
## OTU_57 Dictyochophyceae 57
## OTU_58 Prasinophyceae 58
## OTU_80 other_plastids 80
## OTU_114 Prasinophyceae 114
## st146
## OTU_1 1990
## OTU_2 960
## OTU_3 79
## OTU_4 155
## OTU_5 163
## SILVA_taxonomic_classification
## OTU_1 KC002097.1.1314_Bacteria;Cyanobacteria;Cyanobacteria;SubsectionI;FamilyI;Prochlorococcus;unidentified_marine_bacterioplankton
## OTU_2 EU800388.1.1465_Bacteria;Proteobacteria;Alphaproteobacteria;SAR11_clade;Surface_1;uncultured_bacterium
## OTU_3 KJ590614.1.1421_Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Sulfitobacter;uncultured_bacterium
## OTU_4 EU805005.1.1343_Bacteria;Actinobacteria;Acidimicrobiia;Acidimicrobiales;OM1_clade;Candidatus_Actinomarina;uncultured_bacterium
## OTU_5 KC000519.1.1314_Bacteria;Cyanobacteria;Cyanobacteria;SubsectionI;FamilyI;Synechococcus;unidentified_marine_bacterioplankton
## Phytoref_taxonomic_classification
## OTU_1 EU803052.1.1425|Bacteria|Cyanobacteria|Cyanobacteria|SubsectionI|FamilyI|Prochlorococcus|uncultured|uncultured+bacterium
## OTU_2 <NA>
## OTU_3 <NA>
## OTU_4 <NA>
## OTU_5 AY663935.1.1244|Bacteria|Cyanobacteria|Cyanobacteria|SubsectionI|FamilyI|Synechococcus|uncultured|uncultured+Synechococcus
## SILVA_funct_classif Phytoref_funct_classif class_A
## OTU_1 cyanob cyanob Cyanobacteria
## OTU_2 bacteria <NA> heterotrophic_bacteria
## OTU_3 bacteria <NA> heterotrophic_bacteria
## OTU_4 bacteria <NA> heterotrophic_bacteria
## OTU_5 cyanob cyanob Cyanobacteria
## class_B OTU_no
## OTU_1 Prochlorococcus 1
## OTU_2 heterotrophic_bacteria 2
## OTU_3 heterotrophic_bacteria 3
## OTU_4 heterotrophic_bacteria 4
## OTU_5 Synechococcus 5
class_summary_reads_per_class_16S_protists<-aggregate(rowSums(tb16_protists[1:117]), list(tb16_protists$class_A), sum)
# count the different groups
class_summary_otus_per_class_16S_protists<-aggregate(rowSums(tb16_protists[1:117]), list(tb16_protists$class_A), length)
## READS PER CLASS ##
attach(class_summary_reads_per_class_16S_protists)
class_summary_reads_per_class_16S_protists_order<-class_summary_reads_per_class_16S_protists[order(-x),]
detach(class_summary_reads_per_class_16S_protists)
class_summary_reads_per_class_16S_protists_order
## Group.1 x
## 10 Prymnesiophyceae 4335
## 7 other_plastids 2149
## 5 Dictyochophyceae 1056
## 9 Prasinophyceae 906
## 6 Mamiellophyceae 358
## 8 Pelagophyceae 69
## 1 Bacillariophyta 49
## 2 Bolidophyceae 35
## 4 Cryptomonadales 26
## 11 Rappemonad 26
## 3 Chlorarachniophyta 3
#fix column names
row.names(class_summary_reads_per_class_16S_protists_order)<-class_summary_reads_per_class_16S_protists_order[,1]
class_summary_reads_per_class_16S_protists_order<-class_summary_reads_per_class_16S_protists_order[c(-1)]
colnames(class_summary_reads_per_class_16S_protists_order)<-c("reads_per_class")
## OTUs PER CLASS ##
attach(class_summary_otus_per_class_16S_protists)
class_summary_otus_per_class_16S_protists_order<-class_summary_otus_per_class_16S_protists[order(-x),]
detach(class_summary_otus_per_class_16S_protists)
class_summary_otus_per_class_16S_protists_order
## Group.1 x
## 7 other_plastids 31
## 10 Prymnesiophyceae 27
## 9 Prasinophyceae 10
## 5 Dictyochophyceae 7
## 6 Mamiellophyceae 4
## 1 Bacillariophyta 3
## 2 Bolidophyceae 1
## 3 Chlorarachniophyta 1
## 4 Cryptomonadales 1
## 8 Pelagophyceae 1
## 11 Rappemonad 1
row.names(class_summary_otus_per_class_16S_protists_order)<-class_summary_otus_per_class_16S_protists_order[,1]
class_summary_otus_per_class_16S_protists_order<-class_summary_otus_per_class_16S_protists_order[c(-1)]
colnames(class_summary_otus_per_class_16S_protists_order)<-c("OTUs_per_class")
class_summary_reads_per_class_16S_bacteria<-aggregate(rowSums(tb16_bacteria[1:117]), list(tb16_bacteria$class_A), sum)
# count the different groups
class_summary_otus_per_class_16S_bacteria<-aggregate(rowSums(tb16_bacteria[1:117]), list(tb16_bacteria$class_A), length)
## READS PER CLASS ##
attach(class_summary_reads_per_class_16S_bacteria)
class_summary_reads_per_class_16S_bacteria_order<-class_summary_reads_per_class_16S_bacteria[order(-x),]
detach(class_summary_reads_per_class_16S_bacteria)
class_summary_reads_per_class_16S_bacteria_order
## Group.1 x
## 2 heterotrophic_bacteria 349867
## 1 Cyanobacteria 323933
#fix column names
row.names(class_summary_reads_per_class_16S_bacteria_order)<-class_summary_reads_per_class_16S_bacteria_order[,1]
class_summary_reads_per_class_16S_bacteria_order<-class_summary_reads_per_class_16S_bacteria_order[c(-1)]
colnames(class_summary_reads_per_class_16S_bacteria_order)<-c("reads_per_class")
## OTUs PER CLASS ##
attach(class_summary_otus_per_class_16S_bacteria)
class_summary_otus_per_class_16S_bacteria_order<-class_summary_otus_per_class_16S_bacteria[order(-x),]
detach(class_summary_otus_per_class_16S_bacteria)
class_summary_otus_per_class_16S_bacteria_order
## Group.1 x
## 2 heterotrophic_bacteria 805
## 1 Cyanobacteria 29
row.names(class_summary_otus_per_class_16S_bacteria_order)<-class_summary_otus_per_class_16S_bacteria_order[,1]
class_summary_otus_per_class_16S_bacteria_order<-class_summary_otus_per_class_16S_bacteria_order[c(-1)]
colnames(class_summary_otus_per_class_16S_bacteria_order)<-c("OTUs_per_class")
## st001 st002 st003 st004 st005
## OTU_80 7 5 0 1 5
## OTU_146 0 3 1 1 3
## OTU_170 0 2 2 3 1
## OTU_182 0 1 0 2 0
## OTU_213 0 0 0 0 0
## st001 st002 st003 st004 st005
## OTU_80 7 5 0 1 5
## OTU_146 0 3 1 1 3
## OTU_170 0 2 2 3 1
## OTU_182 0 1 0 2 0
## OTU_213 0 0 0 0 0
## [1] 31 117
## [1] 115
## [1] 117
## [1] 96
## [1] 109
## [1] 26
## [1] 26
## [1] 25
## [1] 3
## [1] 14
## [1] 27
## [1] 22
## [1] 117
## [1] 117
## reads_per_class OTUs_per_class
## Bacillariophyta 49 3
## Bolidophyceae 35 1
## Chlorarachniophyta 3 1
## Cryptomonadales 26 1
## Dictyochophyceae 1056 7
## reads_per_class OTUs_per_class
## Cyanobacteria 323933 29
## heterotrophic_bacteria 349867 805
## NA NA NA
## NA.1 NA NA
## NA.2 NA NA
cyano_tb <- tb16_bacteria[which(tb16_bacteria$class_A != "heterotrophic_bacteria"),]
class_summary_reads_per_class_cyano<-aggregate(rowSums(cyano_tb[1:117]), list(cyano_tb$class_B), sum)
# count the different groups
class_summary_otus_per_class_cyano<-aggregate(rowSums(cyano_tb[1:117]), list(cyano_tb$class_B), length)
## READS PER CLASS ##
attach(class_summary_reads_per_class_cyano)
class_summary_reads_per_class_cyano_order<-class_summary_reads_per_class_cyano[order(-x),]
detach(class_summary_reads_per_class_cyano)
class_summary_reads_per_class_cyano_order
## Group.1 x
## 2 Prochlorococcus 281771
## 3 Synechococcus 40570
## 1 other_cyanobacteria 1592
#fix column names
row.names(class_summary_reads_per_class_cyano_order)<-class_summary_reads_per_class_cyano_order[,1]
class_summary_reads_per_class_cyano_order<-class_summary_reads_per_class_cyano_order[c(-1)]
colnames(class_summary_reads_per_class_cyano_order)<-c("reads_per_class")
## OTUs PER CLASS ##
attach(class_summary_otus_per_class_cyano)
class_summary_otus_per_class_cyano_order<-class_summary_otus_per_class_cyano[order(-x),]
detach(class_summary_otus_per_class_cyano)
class_summary_otus_per_class_cyano_order
## Group.1 x
## 1 other_cyanobacteria 14
## 2 Prochlorococcus 10
## 3 Synechococcus 5
row.names(class_summary_otus_per_class_cyano_order)<-class_summary_otus_per_class_cyano_order[,1]
class_summary_otus_per_class_cyano_order<-class_summary_otus_per_class_cyano_order[c(-1)]
colnames(class_summary_otus_per_class_cyano_order)<-c("OTUs_per_class")
# create_table_merging_#OTUs_#reads
cyano_OTUs_reads <- merge(class_summary_reads_per_class_cyano_order, class_summary_otus_per_class_cyano_order, by="row.names")
row.names(cyano_OTUs_reads)<-cyano_OTUs_reads[,1]
cyano_OTUs_reads<-cyano_OTUs_reads[,-1]
colnames(cyano_OTUs_reads)<-c("reads_per_class","OTUs_per_class")
cyano_OTUs_reads[1:3,1:2]
## reads_per_class OTUs_per_class
## other_cyanobacteria 1592 14
## Prochlorococcus 281771 10
## Synechococcus 40570 5
cyano_OTUs_reads<-cyano_OTUs_reads[order(cyano_OTUs_reads$reads_per_class, cyano_OTUs_reads$OTUs_per_class, decreasing = T), c(1,2)]
cyano_OTUs_reads
## reads_per_class OTUs_per_class
## Prochlorococcus 281771 10
## Synechococcus 40570 5
## other_cyanobacteria 1592 14
#compute relative values
cyano_OTUs_reads_rel_abund <- cyano_OTUs_reads
cyano_OTUs_reads_rel_abund$reads_per_class<-(cyano_OTUs_reads_rel_abund$reads_per_class*100)/colSums(cyano_OTUs_reads)[1]
cyano_OTUs_reads_rel_abund$OTUs_per_class<-(cyano_OTUs_reads_rel_abund$OTUs_per_class*100)/colSums(cyano_OTUs_reads)[2]
colSums(cyano_OTUs_reads_rel_abund)
## reads_per_class OTUs_per_class
## 100 100
rownames(cyano_OTUs_reads_rel_abund) = c("Prochlorococcus", "Synechococcus", "Other cyanobacteria")
cyano_OTUs_reads_rel_abund
## reads_per_class OTUs_per_class
## Prochlorococcus 86.9843455 34.48276
## Synechococcus 12.5241948 17.24138
## Other cyanobacteria 0.4914597 48.27586
cyano_OTUs_reads_rel_abund["cyano_group"]<-NA
cyano_OTUs_reads_rel_abund$cyano_group<-c("Prochlorococcus", "Synechococcus", "Other cyanobacteria")
cyano_OTUs_reads_rel_abund2<-matrix()
cyano_OTUs_reads_rel_abund2["group"]<-NA
cyano_OTUs_reads_rel_abund2$group<-c("Prochlorococcus","Prochlorococcus","Synechococcus","Synechococcus","Other cyanobacteria","Other cyanobacteria")
## Warning in cyano_OTUs_reads_rel_abund2$group <- c("Prochlorococcus",
## "Prochlorococcus", : Coercing LHS to a list
cyano_OTUs_reads_rel_abund2["value"]<-NA
cyano_OTUs_reads_rel_abund2$group<-c(86.9843455,34.48276,12.5241948,17.24138,0.4914597,48.27586)
cyano_OTUs_reads_rel_abund2["data"]<-NA
cyano_OTUs_reads_rel_abund2$group<-c("% of reads", "% of OTUs","% of reads", "% of OTUs","% of reads", "% of OTUs")
cyano_OTUs_reads_rel_abund2
## [[1]]
## [1] NA
##
## $group
## [1] "% of reads" "% of OTUs" "% of reads" "% of OTUs" "% of reads"
## [6] "% of OTUs"
##
## $value
## [1] NA
##
## $data
## [1] NA
#ggplot2.histogram(data=cyano_OTUs_reads_rel_abund, xName='OTUs_per_class',
# groupName='cyano_group', legendPosition="top",
# faceting=TRUE, facetingVarNames="cyano_group",
# facetingDirection="horizontal")
OTUs per class vs. samples in which they occur - PROTISTS [PLOT DESCRIPTION]
Reads per class vs. OTUs per class - PROTISTS [PLOT DESCRIPTION]
OTUs per class vs. samples in which they occur - PROTISTS vs CYANOBACTERIA [PLOT DESCRIPTION]
Reads per class vs. OTUs per class - PROTISTS vs CYANOBACTERIA [PLOT DESCRIPTION]
## Warning in trans$transform(limits): NaNs produced
## Warning in trans$transform(limits): NaNs produced
## reads_per_class OTUs_per_class samples_per_class
## 100.0000 100.0000 394.8718
## reads_per_class OTUs_per_class samples_per_class
## Prymnesiophyceae 63.16479674 48.214286 100.000000
## Dictyochophyceae 15.38685706 12.500000 97.435897
## Prasinophyceae 13.20122395 17.857143 83.760684
## Mamiellophyceae 5.21637768 7.142857 26.495726
## Pelagophyceae 1.00539123 1.785714 22.222222
## Bacillariophyta 0.71397348 5.357143 17.948718
## Bolidophyceae 0.50998106 1.785714 23.076923
## Cryptomonadales 0.37884307 1.785714 11.111111
## Rappemonad 0.37884307 1.785714 11.111111
## Chlorarachniophyta 0.04371266 1.785714 1.709402
OTUs per class vs. samples in which they occur - PROTISTS [PLOT DESCRIPTION]
Reads per class vs. OTUs per class - PROTISTS [PLOT DESCRIPTION]
OTUs per class vs. samples in which they occur - PROTISTS vs. CYANOBACTERIA [PLOT DESCRIPTION]
## reads_per_class OTUs_per_class samples_per_class
## 100.0000 100.0000 494.8718
## reads_per_class OTUs_per_class samples_per_class
## Prymnesiophyceae 1.310475e+00 31.764706 100.000000
## Dictyochophyceae 3.192300e-01 8.235294 97.435897
## Prasinophyceae 2.738848e-01 11.764706 83.760684
## Mamiellophyceae 1.082238e-01 4.705882 26.495726
## Pelagophyceae 2.085878e-02 1.176471 22.222222
## Bacillariophyta 1.481275e-02 3.529412 17.948718
## Bolidophyceae 1.058054e-02 1.176471 23.076923
## Cryptomonadales 7.859829e-03 1.176471 11.111111
## Rappemonad 7.859829e-03 1.176471 11.111111
## Chlorarachniophyta 9.069033e-04 1.176471 1.709402
## Cyanobacteria 9.792531e+01 34.117647 100.000000
## Warning in trans$transform(value): NaNs produced
Reads per class vs. OTUs per class - PROTISTS vs CYANOBACTERIA [PLOT DESCRIPTION]
## Warning in trans$transform(value): NaNs produced
## Warning in trans$transform(limits): NaNs produced
## Warning in trans$transform(limits): NaNs produced
## Warning in trans$transform(limits): NaNs produced
## [1] 960 124
## Row.names st001 st002 st003 st004
## 1 OTU_1 72073 45381 40792 26317
## 2 OTU_10 317 449 135 164
## 3 OTU_100 0 0 0 0
## 4 OTU_1001 0 0 0 0
## 5 OTU_1002 0 0 0 0
## [1] 960 123
## st001 st002 st003 st004 st005
## OTU_1 72073 45381 40792 26317 65068
## OTU_10 317 449 135 164 170
## OTU_100 0 0 0 0 0
## OTU_1001 0 0 0 0 0
## OTU_1002 0 0 0 0 0
## [1] 960 124
## class_A class_B OTU_no
## OTU_1 Cyanobacteria Prochlorococcus 1
## OTU_2 heterotrophic_bacteria heterotrophic_bacteria 2
## OTU_3 heterotrophic_bacteria heterotrophic_bacteria 3
## OTU_4 heterotrophic_bacteria heterotrophic_bacteria 4
## OTU_5 Cyanobacteria Synechococcus 5
## [1] 87 124
## [1] 873 124
## st146
## OTU_54 4
## OTU_57 26
## OTU_58 1
## OTU_80 0
## OTU_114 0
## SILVA_taxonomic_classification
## OTU_54 EU237456.1.1310_Bacteria;Cyanobacteria;Chloroplast;uncultured_bacterium
## OTU_57 AY664138.1.1204_Bacteria;Cyanobacteria;Chloroplast;uncultured_phototrophic_eukaryote
## OTU_58 <NA>
## OTU_80 HM565911.2962.4423_Bacteria;Cyanobacteria;Chloroplast;uncultured_prymnesiophyte_C8704
## OTU_114 JN986350.1.1460_Bacteria;Cyanobacteria;Chloroplast;uncultured_bacterium
## Phytoref_taxonomic_classification
## OTU_54 5523_AB847984|Eukaryota|Hacrobia|Haptophyta|Prymnesiophyceae|Prymnesiophyceae_X|Prymnesiophyceae_XX|Prymnesiophyceae_XXX|Braarudosphaeraceae|Braarudosphaera|Braarudosphaera_bigelowii
## OTU_57 474_Eukaryota|Stramenopiles|Ochrophyta|Dictyochophyceae|Dictyochophyceae_X|Dictyochophyceae_XX|Dictyochophyceae_XXX|Dictyochophyceae_XXXX|Helicopedinella|Helicopedinella_sp
## OTU_58 432_DQ438491|Eukaryota|Archaeplastida|Chlorophyta|Prasinophyceae|Prasinophyceae_X|Prasino-clade-9|Prasino-clade-9_X|Prasino-clade-9_XX|Prasino-clade-9_XXX|Prasino-clade-9_XXX_sp
## OTU_80 <NA>
## OTU_114 891_FJ649309|Eukaryota|Archaeplastida|Chlorophyta|Prasinophyceae|Prasinophyceae_X|Prasino-clade-7|Prasino-clade-7_X|Prasino-clade-7_X-B|Prasino-clade-7_X-B1|Prasino-clade-7_X-B1_sp
## SILVA_funct_classif Phytoref_funct_classif class_A
## OTU_54 plastid plastid Prymnesiophyceae
## OTU_57 plastid plastid Dictyochophyceae
## OTU_58 <NA> plastid Prasinophyceae
## OTU_80 plastid <NA> other_plastids
## OTU_114 plastid plastid Prasinophyceae
## class_B OTU_no
## OTU_54 Prymnesiophyceae 54
## OTU_57 Dictyochophyceae 57
## OTU_58 Prasinophyceae 58
## OTU_80 other_plastids 80
## OTU_114 Prasinophyceae 114
## st146
## OTU_1 3284
## OTU_2 1540
## OTU_3 108
## OTU_4 236
## OTU_5 257
## SILVA_taxonomic_classification
## OTU_1 KC002097.1.1314_Bacteria;Cyanobacteria;Cyanobacteria;SubsectionI;FamilyI;Prochlorococcus;unidentified_marine_bacterioplankton
## OTU_2 EU800388.1.1465_Bacteria;Proteobacteria;Alphaproteobacteria;SAR11_clade;Surface_1;uncultured_bacterium
## OTU_3 KJ590614.1.1421_Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Sulfitobacter;uncultured_bacterium
## OTU_4 EU805005.1.1343_Bacteria;Actinobacteria;Acidimicrobiia;Acidimicrobiales;OM1_clade;Candidatus_Actinomarina;uncultured_bacterium
## OTU_5 KC000519.1.1314_Bacteria;Cyanobacteria;Cyanobacteria;SubsectionI;FamilyI;Synechococcus;unidentified_marine_bacterioplankton
## Phytoref_taxonomic_classification
## OTU_1 EU803052.1.1425|Bacteria|Cyanobacteria|Cyanobacteria|SubsectionI|FamilyI|Prochlorococcus|uncultured|uncultured+bacterium
## OTU_2 <NA>
## OTU_3 <NA>
## OTU_4 <NA>
## OTU_5 AY663935.1.1244|Bacteria|Cyanobacteria|Cyanobacteria|SubsectionI|FamilyI|Synechococcus|uncultured|uncultured+Synechococcus
## SILVA_funct_classif Phytoref_funct_classif class_A
## OTU_1 cyanob cyanob Cyanobacteria
## OTU_2 bacteria <NA> heterotrophic_bacteria
## OTU_3 bacteria <NA> heterotrophic_bacteria
## OTU_4 bacteria <NA> heterotrophic_bacteria
## OTU_5 cyanob cyanob Cyanobacteria
## class_B OTU_no
## OTU_1 Prochlorococcus 1
## OTU_2 heterotrophic_bacteria 2
## OTU_3 heterotrophic_bacteria 3
## OTU_4 heterotrophic_bacteria 4
## OTU_5 Synechococcus 5
class_summary_reads_per_class_16S_protists_non.norm<-aggregate(rowSums(tb16_protists_non.norm[1:117]), list(tb16_protists_non.norm$class_A), sum)
# count the different groups
class_summary_otus_per_class_16S_protists_non.norm<-aggregate(rowSums(tb16_protists_non.norm[1:117]), list(tb16_protists_non.norm$class_A), length)
## READS PER CLASS ##
attach(class_summary_reads_per_class_16S_protists_non.norm)
class_summary_reads_per_class_16S_protists_non.norm_order<-class_summary_reads_per_class_16S_protists_non.norm[order(-x),]
detach(class_summary_reads_per_class_16S_protists_non.norm)
class_summary_reads_per_class_16S_protists_non.norm_order
## Group.1 x
## 10 Prymnesiophyceae 32950
## 7 other_plastids 16065
## 5 Dictyochophyceae 8988
## 9 Prasinophyceae 6579
## 6 Mamiellophyceae 1432
## 1 Bacillariophyta 509
## 8 Pelagophyceae 493
## 2 Bolidophyceae 267
## 4 Cryptomonadales 147
## 11 Rappemonad 133
## 3 Chlorarachniophyta 34
#fix column names
row.names(class_summary_reads_per_class_16S_protists_non.norm_order)<-class_summary_reads_per_class_16S_protists_non.norm_order[,1]
class_summary_reads_per_class_16S_protists_non.norm_order<-class_summary_reads_per_class_16S_protists_non.norm_order[c(-1)]
colnames(class_summary_reads_per_class_16S_protists_non.norm_order)<-c("reads_per_class")
## OTUs PER CLASS ##
attach(class_summary_otus_per_class_16S_protists_non.norm)
class_summary_otus_per_class_16S_protists_non.norm_order<-class_summary_otus_per_class_16S_protists_non.norm[order(-x),]
detach(class_summary_otus_per_class_16S_protists_non.norm)
class_summary_otus_per_class_16S_protists_non.norm_order
## Group.1 x
## 7 other_plastids 31
## 10 Prymnesiophyceae 27
## 9 Prasinophyceae 10
## 5 Dictyochophyceae 7
## 6 Mamiellophyceae 4
## 1 Bacillariophyta 3
## 2 Bolidophyceae 1
## 3 Chlorarachniophyta 1
## 4 Cryptomonadales 1
## 8 Pelagophyceae 1
## 11 Rappemonad 1
row.names(class_summary_otus_per_class_16S_protists_non.norm_order)<-class_summary_otus_per_class_16S_protists_non.norm_order[,1]
class_summary_otus_per_class_16S_protists_non.norm_order<-class_summary_otus_per_class_16S_protists_non.norm_order[c(-1)]
colnames(class_summary_otus_per_class_16S_protists_non.norm_order)<-c("OTUs_per_class")
class_summary_reads_per_class_16S_bacteria_non.norm<-aggregate(rowSums(tb16_bacteria_non.norm[1:117]), list(tb16_bacteria_non.norm$class_A), sum)
# count the different groups
class_summary_otus_per_class_16S_bacteria_non.norm<-aggregate(rowSums(tb16_bacteria_non.norm[1:117]), list(tb16_bacteria_non.norm$class_A), length)
## READS PER CLASS ##
attach(class_summary_reads_per_class_16S_bacteria_non.norm)
class_summary_reads_per_class_16S_bacteria_non.norm_order<-class_summary_reads_per_class_16S_bacteria_non.norm[order(-x),]
detach(class_summary_reads_per_class_16S_bacteria_non.norm)
class_summary_reads_per_class_16S_bacteria_non.norm_order
## Group.1 x
## 2 heterotrophic_bacteria 2883944
## 1 Cyanobacteria 2819377
#fix column names
row.names(class_summary_reads_per_class_16S_bacteria_non.norm_order)<-class_summary_reads_per_class_16S_bacteria_non.norm_order[,1]
class_summary_reads_per_class_16S_bacteria_non.norm_order<-class_summary_reads_per_class_16S_bacteria_non.norm_order[c(-1)]
colnames(class_summary_reads_per_class_16S_bacteria_non.norm_order)<-c("reads_per_class")
## OTUs PER CLASS ##
attach(class_summary_otus_per_class_16S_bacteria_non.norm)
class_summary_otus_per_class_16S_bacteria_non.norm_order<-class_summary_otus_per_class_16S_bacteria_non.norm[order(-x),]
detach(class_summary_otus_per_class_16S_bacteria_non.norm)
class_summary_otus_per_class_16S_bacteria_non.norm_order
## Group.1 x
## 2 heterotrophic_bacteria 842
## 1 Cyanobacteria 31
row.names(class_summary_otus_per_class_16S_bacteria_non.norm_order)<-class_summary_otus_per_class_16S_bacteria_non.norm_order[,1]
class_summary_otus_per_class_16S_bacteria_non.norm_order<-class_summary_otus_per_class_16S_bacteria_non.norm_order[c(-1)]
colnames(class_summary_otus_per_class_16S_bacteria_non.norm_order)<-c("OTUs_per_class")
## st001 st002 st003 st004 st005
## OTU_80 102 130 2 5 56
## OTU_146 21 33 31 39 21
## OTU_170 4 21 7 15 7
## OTU_182 4 8 2 3 1
## OTU_213 0 2 1 2 0
## st001 st002 st003 st004 st005
## OTU_80 102 130 2 5 56
## OTU_146 21 33 31 39 21
## OTU_170 4 21 7 15 7
## OTU_182 4 8 2 3 1
## OTU_213 0 2 1 2 0
## [1] 31 117
## [1] 117
## [1] 117
## [1] 116
## [1] 116
## [1] 43
## [1] 63
## [1] 69
## [1] 23
## [1] 32
## [1] 62
## [1] 46
## [1] 117
## [1] 117
## reads_per_class OTUs_per_class
## Bacillariophyta 509 3
## Bolidophyceae 267 1
## Chlorarachniophyta 34 1
## Cryptomonadales 147 1
## Dictyochophyceae 8988 7
## reads_per_class OTUs_per_class
## Cyanobacteria 2819377 31
## heterotrophic_bacteria 2883944 842
## NA NA NA
## NA.1 NA NA
## NA.2 NA NA
## reads_per_class OTUs_per_class samples_per_class
## 100.0000 100.0000 394.8718
## reads_per_class OTUs_per_class samples_per_class
## Prymnesiophyceae 63.94085229 48.214286 100.000000
## Dictyochophyceae 17.44158969 12.500000 97.435897
## Prasinophyceae 12.76682450 17.857143 83.760684
## Mamiellophyceae 2.77885586 7.142857 26.495726
## Bacillariophyta 0.98773578 5.357143 22.222222
## Pelagophyceae 0.95668711 1.785714 17.948718
## Bolidophyceae 0.51812466 1.785714 23.076923
## Cryptomonadales 0.28525964 1.785714 11.111111
## Rappemonad 0.25809206 1.785714 11.111111
## Chlorarachniophyta 0.06597842 1.785714 1.709402
OTUs per class vs. samples in which they occur - PROTISTS [PLOT DESCRIPTION]
Reads per class vs. OTUs per class - PROTISTS [PLOT DESCRIPTION]
OTUs per class vs. samples in which they occur - PROTISTS vs. CYANOBACTERIA [PLOT DESCRIPTION]
## reads_per_class OTUs_per_class samples_per_class
## 100.0000 100.0000 494.8718
## reads_per_class OTUs_per_class samples_per_class
## Prymnesiophyceae 1.147720112 31.034483 100.000000
## Dictyochophyceae 0.313071574 8.045977 97.435897
## Prasinophyceae 0.229160869 11.494253 83.760684
## Mamiellophyceae 0.049879672 4.597701 26.495726
## Bacillariophyta 0.017729576 3.448276 22.222222
## Pelagophyceae 0.017172261 1.149425 17.948718
## Bolidophyceae 0.009300190 1.149425 23.076923
## Cryptomonadales 0.005120329 1.149425 11.111111
## Rappemonad 0.004632679 1.149425 11.111111
## Chlorarachniophyta 0.001184294 1.149425 1.709402
## Cyanobacteria 98.205028442 35.632184 100.000000
Reads per class vs. OTUs per class - PROTISTS vs CYANOBACTERIA [PLOT DESCRIPTION]